# Low memory usage
## FastVLM-1.5B-Stage3-MNN

**License:** Apache-2.0 · **Author:** taobao-mnn · **Tags:** Large Language Model, English · **Downloads:** 1,157 · **Likes:** 1

FastVLM-1.5B-Stage3-MNN is an 8-bit quantized version of FastVLM-1.5B-Stage3 exported to the MNN format. It is a Transformer-based text generation model suited to conversational use.
## Qwen3-8B-GGUF

**License:** Apache-2.0 · **Author:** bartowski · **Tags:** Large Language Model · **Downloads:** 23.88k · **Likes:** 18

A quantized version of Qwen3-8B, produced with llama.cpp using the imatrix (importance matrix) quantization option, suitable for text generation tasks.
## EXAONE-3.5-32B-Instruct-GGUF

**License:** Other · **Author:** bartowski · **Tags:** Large Language Model, Multilingual · **Downloads:** 616 · **Likes:** 9

EXAONE-3.5-32B-Instruct is a 32B-parameter large language model supporting instruction following and dialogue tasks.
## Impish_Mind_8B-GGUF

**License:** Apache-2.0 · **Author:** bartowski · **Tags:** Large Language Model, English · **Downloads:** 532 · **Likes:** 9

Quantized versions of the SicariusSicariiStuff/Impish_Mind_8B model, produced with llama.cpp tooling in a range of quantization formats, suitable for text generation tasks.
## ESMplusplus_small

**Author:** Synthyra · **Tags:** Protein Model, Transformers · **Downloads:** 6,460 · **Likes:** 14

ESM++ is a faithful implementation of ESMC that supports batched inference and is compatible with the standard Hugging Face interface, with no dependency on the ESM Python package. The small version corresponds to the 300M-parameter ESMC model.
## FLUX.1-Lite-GGUF

**License:** Other · **Author:** gpustack · **Tags:** Text-to-Image · **Downloads:** 5,452 · **Likes:** 3

FLUX.1 Lite is an 8-billion-parameter Transformer distilled from FLUX.1-dev and optimized for text-to-image generation, reducing memory usage and improving speed while maintaining accuracy.